How to Evaluate the Quality of Unsupervised Anomaly Detection Algorithms?
Abstract
When sufficient labeled data are available, classical criteria based on Receiver Operating Characteristic (ROC) or Precision-Recall (PR) curves can be used to compare the performance of unsupervised anomaly detection algorithms. However, in many situations, few or no data are labeled. This calls for alternative criteria that can be computed on unlabeled data. In this paper, two criteria that do not require labels are empirically shown to discriminate accurately (w.r.t. ROC or PR based criteria) between algorithms. These criteria are based on the existing Excess-Mass (EM) and Mass-Volume (MV) curves, which generally cannot be estimated well in high dimension. A methodology based on feature sub-sampling and aggregating is also described and tested, extending the use of these criteria to high-dimensional datasets and solving major drawbacks inherent to standard EM and MV curves.
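As a rough illustration of the idea summarized above, the sketch below computes a label-free, EM-style criterion and averages it over random feature subsets. It is not the authors' implementation. It assumes the usual Excess-Mass definition, EM_s(t) = sup_u [P(s(X) >= u) - t * Leb({s >= u})], with level-set volumes estimated by uniform Monte Carlo sampling in the bounding box of the data. The function names (`em_auc`, `em_auc_subsampled`, `fit_gaussian_scorer`), the grid of t values, and the Gaussian scorer standing in for an anomaly detector are all hypothetical choices made for this example.

```python
import numpy as np

def em_auc(score_fn, X, n_mc=20000, n_t=200, seed=0):
    """Area under an empirical Excess-Mass curve (larger suggests a better scorer).

    Assumes score_fn maps an (n, d) array to n scores, larger = more normal.
    """
    rng = np.random.default_rng(seed)
    lo, hi = X.min(axis=0), X.max(axis=0)
    box_volume = np.prod(hi - lo)
    # Uniform Monte Carlo sample used to estimate level-set volumes.
    U = rng.uniform(lo, hi, size=(n_mc, X.shape[1]))
    s_X = np.sort(score_fn(X))
    s_U = np.sort(score_fn(U))
    levels = np.unique(s_X)
    # Fraction of points with score >= u, for every candidate level u.
    mass = 1.0 - np.searchsorted(s_X, levels, side="left") / s_X.size
    vol = (1.0 - np.searchsorted(s_U, levels, side="left") / s_U.size) * box_volume
    # Illustrative grid of t values (an assumption, not taken from the paper).
    t = np.linspace(0.0, 10.0 / box_volume, n_t)
    em = np.max(mass[None, :] - t[:, None] * vol[None, :], axis=1)
    em = np.maximum(em, 0.0)  # the empty level set always achieves 0
    # Trapezoidal rule for the area under the curve.
    return float(np.sum((em[1:] + em[:-1]) * np.diff(t)) / 2.0)

def fit_gaussian_scorer(X_train):
    """Hypothetical stand-in detector: negative squared Mahalanobis distance."""
    mean = X_train.mean(axis=0)
    cov = np.atleast_2d(np.cov(X_train, rowvar=False))
    prec = np.linalg.inv(cov + 1e-9 * np.eye(cov.shape[0]))

    def score(Z):
        diff = Z - mean
        return -np.einsum("ij,jk,ik->i", diff, prec, diff)

    return score

def em_auc_subsampled(fit_scorer, X, d_sub=2, n_draws=10, seed=0):
    """Average the criterion over random feature subsets of size d_sub."""
    rng = np.random.default_rng(seed)
    values = []
    for _ in range(n_draws):
        feats = rng.choice(X.shape[1], size=d_sub, replace=False)
        X_sub = X[:, feats]
        # Refit the scorer on the selected features only, then evaluate it.
        values.append(em_auc(fit_scorer(X_sub), X_sub, seed=int(rng.integers(1 << 31))))
    return float(np.mean(values))

# Usage: compare an informative scorer with pure-noise scores on 8-D Gaussian data.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(500, 8))
noise_rng = np.random.default_rng(1)
good = em_auc_subsampled(fit_gaussian_scorer, X)
bad = em_auc_subsampled(lambda X_train: (lambda Z: noise_rng.random(len(Z))), X)
print(f"Gaussian scorer: {good:.4f}  noise scorer: {bad:.4f}")
```

Working on low-dimensional feature subsets keeps the Monte Carlo volume estimate usable, which is the motivation the abstract gives for sub-sampling; the subset size and number of draws here are arbitrary.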
Similar resources
Assessment Methodology for Anomaly-Based Intrusion Detection in Cloud Computing
Cloud computing has become an attractive target for attackers, as mainstream cloud technologies such as virtualization and multitenancy permit multiple users to utilize the same physical resource, thereby posing the so-called problem of internal-facing security. Moreover, traditional network-based intrusion detection systems (IDSs) are ineffective when deployed in the cloud...
Impact of linear dimensionality reduction methods on the performance of anomaly detection algorithms in hyperspectral images
Anomaly Detection (AD) has recently become an important application of hyperspectral image analysis. The goal of these algorithms is to find objects in the image scene that are anomalous in comparison to their surrounding background. One way to improve the performance and runtime of these algorithms is to use Dimensionality Reduction (DR) techniques. This paper evaluates the effect of thr...
Improving the RX Anomaly Detection Algorithm for Hyperspectral Images using FFT
Anomaly Detection (AD) has recently become an important application of target detection in hyperspectral images. The Reed-Xiaoli (RX) detector is the most widely used AD algorithm, but it suffers from the “small sample size” problem. The best solution for this problem is to use Dimensionality Reduction (DR) techniques as a pre-processing step for the RX detector. Using this method not only improves the detection p...
Behavior Analysis Using Unsupervised Anomaly Detection
The detection of anomalous behavior in log and sensor data is a frequently requested task in many data mining applications. If no labels are available in the dataset, as in many real-world setups, unsupervised anomaly detection would be the method of choice. Since these algorithms are in general not directly applicable to the data, an appropriate transformation has to be performed first. This...
Situation Awareness in Colour Printing and Beyond
Machine learning methods are increasingly being used to solve real-world problems in society. Often, the complexity of the methods is well hidden from users. However, integrating machine learning methods into real-world applications is not a straightforward process and requires knowledge of both the methods and the problem domain. Two such domains are colour print quality asse...
Journal title:
- CoRR
Volume: abs/1607.01152
Publication date: 2016